Improving Example Based Machine Translation Through Morphological Generalization and Adaptation
Abstract
Example Based Machine Translation (EBMT) is limited by the quantity and scope of its training data. Even with a reasonably large corpus, we will not have examples that cover everything we want to translate. This problem is especially severe in Arabic due to its rich morphology. We demonstrate a novel method that exploits the regular nature of Arabic morphology to increase the quality and coverage of machine translation. Through the use of generalization and rewrite rules, we are able to recover the English translation of phrases that do not exist in the training corpora. Furthermore, this system shows improvement in BLEU even with a training corpus of 1.4 million sentence pairs.
Similar Resources
Arabic-to-English Example Based Machine Translation Using Context-Insensitive Morphological Analysis
We describe and discuss the results of ongoing experiments that use morphological analysis in the context of Example-Based Machine Translation. The goal is to increase the coverage of our training examples so as to capture things that are not directly seen in the training text. This is done through a two-stage process of generalization and filtering.
A Systematic Adaptation Scheme for English-Hindi Example-Based Machine Translation
The success of Example-Based Machine Translation (EBMT) often depends upon how efficient the adaptation scheme is. Adaptation primarily aims at modifying retrieved examples to meet the required demands of a given translation task. The present work looks at adaptation for EBMT from English to Hindi. This paper describes a rule-driven adaptation scheme for modifying a retrieved translation exampl...
Domain Adaptation Through Phrase Generalization for Improved Statistical Machine Translation Quality
This paper presents a method for domain adaptation (incorporating out-of-domain data) through phrase generalization (learning/using phrase templates) in order to improve the Italian-English translation quality on the BTEC travel task. The process of phrase generalization is described, and its inclusion in the system resulted in noticeable, but only minor improvements because of alignment proble...
Monolingual Machine Translation. The Tenth Biennial Conference of the Association for Machine Translation in the Americas, AMTA 2012, 20 Years. Tsuyoshi Okita
This paper presents a detailed study of a method for morphology generalization and generation to address out-of-domain translations in English-to-Spanish phrase-based MT. The paper studies whether the morphological richness of the target language causes poor quality translation when translating out-of-domain. In detail, this approach first translates into Spanish simplified forms and then predic...
A Specific Least General Generalization of Strings and Its Application to Example Based Machine Translation
Since the least general generalization (LGG) of strings may cause over-generalization in the generalization process of clauses, we propose a specific least general generalization (SLGG) of strings to reduce over-generalization. To create an SLGG of two strings, first a minimal match sequence between these strings is found. A minimal match sequence of two strings consists of similarities and differences t...